Spatio-Channel Attention Blocks for Cross-modal Crowd Counting
Abstract
Crowd counting research has made significant advancements in real-world applications, but it remains a formidable challenge in cross-modal settings. Most existing methods rely solely on the optical features of RGB images, ignoring the feasibility of other modalities such as thermal and depth images. The inherent differences between modalities and the diversity of design choices for model architectures make cross-modal crowd counting more challenging. In this paper, we propose Cross-modal Spatio-Channel Attention (CSCA) blocks, which can be easily integrated into any modality-specific architecture. The CSCA blocks first spatially capture global functional correlations among the modalities with less overhead through spatial-wise attention. The features with spatial attention are subsequently refined through adaptive channel-wise feature aggregation. In our experiments, the proposed block consistently shows performance improvement across various backbone networks, resulting in state-of-the-art results in RGB-T and RGB-D crowd counting.
Keywords: Crowd counting, Cross-modal, Attention
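The two-stage idea in the abstract (spatial attention across modalities, then channel-wise aggregation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the absence of learned projections, the scaled dot-product form of the attention, and the softmax gate built from channel means are all assumptions made for illustration, and the sketch makes no attempt to reproduce the overhead reduction the abstract mentions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_cross_attention(query_feat, context_feat):
    """Each spatial position of query_feat attends over all positions
    of context_feat (hypothetical helper; no learned weights).

    Both inputs have shape (C, H, W); the output has the same shape.
    """
    c, h, w = query_feat.shape
    q = query_feat.reshape(c, h * w).T    # (N, C), N = H * W
    k = context_feat.reshape(c, h * w).T  # (N, C)
    v = k                                 # values reuse the context features
    attn = softmax(q @ k.T / np.sqrt(c), axis=-1)  # (N, N), rows sum to 1
    out = attn @ v                        # (N, C)
    return out.T.reshape(c, h, w)


def channel_aggregate(own_feat, attended_feat):
    """Blend two feature maps with a per-channel softmax gate.

    The gate is computed from global average pooling of each map; this
    gating rule is an assumption, not the rule used in the paper.
    """
    scores = np.stack([
        own_feat.mean(axis=(1, 2)),
        attended_feat.mean(axis=(1, 2)),
    ])                                    # (2, C)
    gates = softmax(scores, axis=0)       # per channel, the two gates sum to 1
    return (gates[0][:, None, None] * own_feat
            + gates[1][:, None, None] * attended_feat)


def csca_like_block(rgb_feat, aux_feat):
    """Apply both stages symmetrically to an RGB map and an auxiliary
    (for example thermal or depth) map of the same shape."""
    rgb_att = spatial_cross_attention(rgb_feat, aux_feat)
    aux_att = spatial_cross_attention(aux_feat, rgb_feat)
    return (channel_aggregate(rgb_feat, rgb_att),
            channel_aggregate(aux_feat, aux_att))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.normal(size=(8, 6, 6))
    thermal = rng.normal(size=(8, 6, 6))
    rgb_out, thermal_out = csca_like_block(rgb, thermal)
    print(rgb_out.shape, thermal_out.shape)
```

Because the block returns one refined map per modality with unchanged shape, it could in principle be dropped between stages of two parallel backbones, which is consistent with the abstract's claim that the block integrates into modality-specific architectures.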
Similar resources
Cross-modal attention enhances perceived contrast.
Visual Attention: Each time we open our eyes, we are confronted with an overwhelming amount of information. How is it possible, then, that we still have a strong impression that we understand what we see? Visual attention is the mechanism that turns looking into seeing, allowing us to select a certain location or aspect of a busy visual scene, and prioritize its processing. Such selection is nec...
Cross-modal decoupling in temporal attention.
Prior studies have repeatedly reported behavioural benefits to events occurring at attended, compared to unattended, points in time. It has been suggested that, as for spatial orienting, temporal orienting of attention spreads across sensory modalities in a synergistic fashion. However, the consequences of cross-modal temporal orienting of attention remain poorly understood. One challenge is th...
When cross-modal spatial attention fails.
There is now convincing evidence that an involuntary shift of spatial attention to a stimulus in one modality can affect the processing of stimuli in other modalities, but inconsistent findings across different paradigms have led to controversy. Such inconsistencies have important implications for theories of cross-modal attention. The authors investigated why orienting attention to a visual ev...
Cross-modal cuing and selective attention
Experiments on cuing have long provided insights into the mechanisms of selective attention. A visual cue presented in a particular location can enhance subsequent visual discriminations at that location, making them faster, or more accurate, or both. The standard interpretation of such experiments is that the cue attracts attention. Subsequent stimuli at that location are then more likely to b...
Depth Information Guided Crowd Counting for Complex Crowd Scenes
It is important to monitor and analyze crowd events for the sake of city safety. In an EDOF (extended depth of field) image with a crowded scene, the distribution of people is highly imbalanced. People far away from the camera look much smaller and often occlude each other heavily, while people close to the camera look larger. In such a case, it is difficult to accurately estimate the number of...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2023
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-031-26284-5_2